A universal neocortical mask for Centiloid quantification

Abstract

INTRODUCTION: The Centiloid (CL) project was developed to harmonize the quantification of amyloid beta (Aβ) positron emission tomography (PET) scans to a unified scale. The CL neocortical mask was defined using 11C Pittsburgh compound B (PiB), overlooking potential differences in regional distribution among Aβ tracers. We created a universal mask using an independent dataset of five Aβ tracers, and investigated its impact on inter-tracer agreement, tracer variability, and group separation.

METHODS: Using data from the Alzheimer's Dementia Onset and Progression in International Cohorts (ADOPIC) study (Australian Imaging Biomarkers and Lifestyle + Alzheimer's Disease Neuroimaging Initiative + Open Access Series of Imaging Studies), age-matched pairs of mild Alzheimer's disease (AD) and healthy controls (HC) were selected: 18F-florbetapir (N = 147 pairs), 18F-florbetaben (N = 22), 18F-flutemetamol (N = 10), 18F-NAV (N = 42), 11C-PiB (N = 63). The images were spatially and standardized uptake value ratio normalized. For each tracer, the mean AD-HC difference image was thresholded to maximize the overlap with the standard neocortical mask. The universal mask was defined as the intersection of all five masks. It was evaluated on the Global Alzheimer's Association Interactive Network (GAAIN) head-to-head datasets in terms of inter-tracer agreement and variance in the young controls (YC) and on the ADOPIC dataset comparing separation between HC/AD and HC/mild cognitive impairment (MCI).

RESULTS: In the GAAIN dataset, the universal mask led to a small reduction in the variance of the YC, and a small increase in the inter-tracer agreement. In the ADOPIC dataset, it led to a better separation between HC/AD and HC/MCI at baseline.

DISCUSSION: The universal CL mask led to an increase in inter-tracer agreement and group separation. Those increases were, however, very small, and do not provide sufficient benefits to support departing from the existing standard CL mask, which is suitable for the quantification of all Aβ tracers.

HIGHLIGHTS
• This study built an amyloid universal mask using a matched cohort for the five most commonly used amyloid positron emission tomography tracers.
• There was a high overlap between each tracer-specific mask.
• Differences in quantification and group separation between the standard and universal mask were small.
• The existing standard Centiloid mask is suitable for the quantification of all amyloid beta tracers.

INTRODUCTION
The Centiloid (CL) project is a standardized method to harmonize amyloid beta (Aβ) quantification for positron emission tomography (PET) images. It not only provides a standard processing pipeline, along with standard masks for neocortical retention and reference regions, but also provides a framework to anchor different tracers and processing pipelines to the same reference values. Using the provided neocortical mask, reference region mask, and a set of published transforms, the five most commonly used Aβ PET tracers can be quantified in Centiloids using the statistical parametric mapping (SPM) pipeline. [1-5] Recent advancements in medical imaging technology have enabled the development of more advanced model-based methods for generating and improving CL quantification, such as non-negative matrix factorization, [6,7] AmyQ, [8] Aβ index, [9] and amyloid load (AmyloidIQ), [10] which all use a model fitted to the entire image to perform the quantification. These methods use advanced machine learning techniques to improve inter-tracer agreement, reduce longitudinal variability, and improve group separation. However, these advanced techniques are limited to research settings, and most clinical applications and clinical trials still rely on the standard quantification pipelines (SPM or other similar well-validated pipelines) and the associated quantification masks, given their simplicity and availability.
One of the potential limitations of the standard neocortical mask is that it was defined using a single tracer, 11C-Pittsburgh compound B (PiB), and therefore does not account for potential differences in regional distribution among Aβ PET tracers. While all five most commonly used Aβ tracers have demonstrated high affinity and specificity for fibrillar Aβ in plaques, [11] and in vitro comparisons have found that all tracers bind to similar binding sites, [12,13] differences in tracer affinity and degree of non-specific binding could lead to slight differences in regional distribution. Using 11C-PiB, one of the tracers with the highest affinity and lowest non-specific binding, as the reference to define the cortical areas to be sampled could potentially result in the inclusion of regions where binding is not detectable using other tracers. This could increase the noise and reduce the specificity of the other Aβ PET tracers.
In this work, we aim to build a new universal neocortical CL mask based on all five Aβ tracers and to evaluate its impact on inter-tracer agreement, tracer variability, and group separation using both cross-sectional and longitudinal data, compared to the standard CL mask.

Population selection
Using data from the ADOPIC study, mild AD patients were selected using the following criteria: clinical diagnosis of AD (with AIBL and ADNI using the National Institute of Neurological and Communica-

PET analysis
All PET images from the ADOPIC study were smoothed to a uniform 8 mm resolution to reduce the influence of different scanner sharpness on the derived masks. The images were then spatially normalized to the Montreal Neurological Institute template using the standard SPM CL pipeline. [2] The spatially normalized images were then mirrored to remove any asymmetry. Standardized uptake value ratio (SUVR) normalization was performed using the CL whole cerebellum mask (WCb) as reference region. Mean AD and HC images were then computed for each tracer along with a corresponding difference image (AD-HC).
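The SUVR normalization and the AD-HC difference image can be illustrated with a minimal sketch. It assumes each image has already been smoothed, spatially normalized, and mirrored, and represents an image as a flat list of voxel values to stay dependency-free; a real pipeline would operate on 3D image arrays. The function names and the toy values are hypothetical.

```python
from statistics import mean


def suvr_normalize(image, reference_mask):
    """Divide every voxel by the mean uptake inside the reference region."""
    ref_values = [v for v, in_ref in zip(image, reference_mask) if in_ref]
    ref_mean = mean(ref_values)
    return [v / ref_mean for v in image]


def mean_image(images):
    """Voxel-wise mean across a list of equally sized images."""
    return [mean(voxels) for voxels in zip(*images)]


def difference_image(ad_images, hc_images):
    """Voxel-wise mean AD image minus mean HC image."""
    mean_ad = mean_image(ad_images)
    mean_hc = mean_image(hc_images)
    return [a - h for a, h in zip(mean_ad, mean_hc)]


# Toy data (assumption): 4-voxel images whose last two voxels
# stand in for the whole cerebellum reference region.
reference_mask = [False, False, True, True]
ad_raw = [[4.0, 3.0, 2.0, 2.0], [6.0, 5.0, 2.0, 2.0]]
hc_raw = [[2.0, 2.0, 2.0, 2.0], [3.0, 3.0, 3.0, 3.0]]

ad = [suvr_normalize(img, reference_mask) for img in ad_raw]
hc = [suvr_normalize(img, reference_mask) for img in hc_raw]
diff = difference_image(ad, hc)
```

In this toy case the difference is positive only in the first two voxels, which is the kind of pattern the subsequent thresholding step turns into a tracer-specific mask.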
While different thresholds for the difference images could be explored, this was not the primary aim of this work. Instead, each tracer's threshold was defined so that the resulting mask maximizes the overlap with the original CL mask. This was implemented using a Powell optimizer seeking to maximize the Dice similarity score, [20] which is used as a measure of mask overlap. The Dice similarity score was selected in this application as it is commonly used to optimize segmentation models.
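As an illustration of the threshold selection, the sketch below computes the Dice similarity score, 2|A ∩ B| / (|A| + |B|), and picks the threshold whose mask best overlaps a reference mask. Note the substitution: a simple search over candidate thresholds stands in for the Powell optimizer used in the study, and the masks, candidate values, and function names are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice similarity score between two binary masks of equal length."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks are treated as a perfect match.
    return 2.0 * intersection / total if total else 1.0


def threshold_mask(difference_image, threshold):
    """Binary mask of voxels whose AD-HC difference exceeds the threshold."""
    return [v > threshold for v in difference_image]


def best_threshold(difference_image, standard_mask, candidates):
    """Candidate threshold giving the highest Dice with the standard mask.

    A plain search over candidates; the study used a Powell optimizer.
    """
    return max(
        candidates,
        key=lambda t: dice(threshold_mask(difference_image, t), standard_mask),
    )


# Toy data (assumption): 5-voxel difference image and standard mask.
diff = [0.9, 0.7, 0.4, 0.1, 0.0]
standard_mask = [True, True, False, False, False]
candidates = [0.05, 0.3, 0.5, 0.8]

chosen = best_threshold(diff, standard_mask, candidates)
chosen_dice = dice(threshold_mask(diff, chosen), standard_mask)
```

Because the objective depends on a single scalar, any one-dimensional search would serve for illustration; the point is only that the threshold is chosen by overlap with the standard mask rather than fixed in advance.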
Finally, the universal mask was defined as the intersection of all tracer-specific masks. The universal mask was then used to recalibrate the CL equation for PiB using the Global Alzheimer's Association Interactive Network (GAAIN) PiB dataset of young controls (YC) and mild AD, followed by each tracer using their respective PiB/18F-tracer pairs from the GAAIN dataset.
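The intersection and the PiB recalibration can be sketched as follows. The sketch assumes the standard Centiloid anchoring, in which the mean SUVR of the young controls maps to 0 CL and the mean SUVR of the mild AD group maps to 100 CL. Only the PiB step is shown; the conversion of each 18F tracer through its paired PiB scans is not. All names and values are hypothetical.

```python
from statistics import mean


def intersect_masks(masks):
    """Voxel is kept only if it belongs to every tracer-specific mask."""
    return [all(voxels) for voxels in zip(*masks)]


def masked_mean(image, mask):
    """Mean SUVR of the voxels inside the mask."""
    return mean([v for v, m in zip(image, mask) if m])


def centiloid(suvr, yc_mean_suvr, ad_mean_suvr):
    """Linear rescaling so that YC mean -> 0 CL and AD mean -> 100 CL."""
    return 100.0 * (suvr - yc_mean_suvr) / (ad_mean_suvr - yc_mean_suvr)


# Toy data (assumption): three tracer-specific masks over 4 voxels.
tracer_masks = [
    [True, True, True, False],
    [True, True, False, False],
    [True, True, True, True],
]
universal_mask = intersect_masks(tracer_masks)

# Toy SUVR images for the calibration groups (assumption).
yc_images = [[1.0, 1.0, 2.0, 2.0], [1.2, 1.0, 2.0, 2.0]]
ad_images = [[2.0, 2.2, 2.0, 2.0], [2.4, 2.0, 2.0, 2.0]]

yc_anchor = mean(masked_mean(img, universal_mask) for img in yc_images)
ad_anchor = mean(masked_mean(img, universal_mask) for img in ad_images)

cl_value = centiloid(1.6, yc_anchor, ad_anchor)
```

Changing the mask changes both anchors, which is why the CL equation has to be re-derived whenever the neocortical mask is replaced.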

RESEARCH IN CONTEXT
1. Systematic Review: The authors reviewed the literature using traditional (e.g., PubMed) sources and meeting abstracts and presentations. While amyloid beta (Aβ) positron emission tomography (PET) tracers' affinity and specificity for fibrillar Aβ in plaques have been compared in vitro, there is limited evidence of potential differences in regional distribution in vivo.
2. Interpretation: Our results indicate that using a universal neocortical Centiloid mask led to marginal improvements using our chosen metrics, indicating that a universal mask is not required and that the existing standard mask is suitable for the quantification of all Aβ PET tracers.
3. Future Directions: While this article only focused on the target region, a similar exploration should be conducted to choose the optimal reference region for each tracer.

Evaluation
Paired t tests were used to assess differences in MMSE, CDR, age, and CL between the matched HC and AD for each tracer. Cohen d was used to compute the corresponding effect size. Chi-square was used to assess differences in sex distribution. Analysis of variance was used to assess if there were any differences in MMSE, CDR, age, and CL between the HC (and AD) participants selected for each tracer.
Similarly, chi-square was used to assess differences in sex distribution between the HC (and AD) participants selected for each tracer.
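For readers unfamiliar with the effect size measure, a minimal sketch of Cohen d between two groups is given here. The text does not state which variant was used, so the pooled standard deviation form is an assumption, and the CL values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy CL values (assumption), not taken from the study.
ad_cl = [80.0, 90.0, 100.0]
hc_cl = [0.0, 10.0, 20.0]
effect_size = cohens_d(ad_cl, hc_cl)
```

A larger d means the two groups are further apart relative to their spread, which is how group separation is compared between masks.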
The standard and universal masks were first evaluated on the GAAIN dataset in terms of inter-tracer correlation, using the coefficient of determination (R²), and variance in the YC. They were then evaluated on the ADOPIC baseline population to measure their impact on the separation between HC, MCI, and AD, assessed using Cohen d, and on the correlation between CL and MMSE. The separation between HC, MCI, and AD was also evaluated using the longitudinal rate of change. Last, Spearman ρ was used to assess the correlation between the baseline CL and the rate of change (CL/Yr).
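The two correlation measures can be sketched in a few lines. The sketch assumes R² is taken as the squared Pearson correlation (which equals the coefficient of determination of a simple linear fit) and computes Spearman ρ as the Pearson correlation of average ranks. The paired values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def r_squared(x, y):
    """Coefficient of determination of a simple linear fit of y on x."""
    return pearson_r(x, y) ** 2


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Toy paired CL values (assumption): PiB vs an 18F tracer.
pib_cl = [0.0, 20.0, 50.0, 100.0]
f18_cl = [2.0, 21.0, 49.0, 103.0]
agreement = r_squared(pib_cl, f18_cl)

# Toy baseline CL vs rate of change (assumption).
rho = spearman_rho([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0])
```

Spearman ρ only depends on the ordering of the values, so it suits the baseline CL versus CL/Yr comparison, where the relationship need not be linear.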
For comparison, the same experiments were also conducted with each tracer quantified using its own tracer-specific mask.

RESULTS
For each tracer, the number of matched HC/AD pairs was as follows: 18F-florbetapir (FBP), N = 147; 18F-florbetaben (FBB), N = 22; 18F-flutemetamol (FLT), N = 10; 18F-NAV, N = 42; and 11C-PiB, N = 63.
The universal mask, defined as the intersection of all five masks, is presented in Figure 2 along with the standard mask, and their overlaps and differences. There is a good overlap between the universal and standard masks (Dice = 0.74). The universal mask was, however, slightly narrower than the standard mask, especially in the frontal lobe, resulting in a 26% smaller volume.
The variance in the GAAIN YC and the correlation between the 18F-tracer/11C-PiB pairs are presented in Table 3. The variance in the YC CLs was systematically lower using the universal mask compared to using the standard mask for all tracers (3.4% lower on average). The 18F-tracer/11C-PiB correlations in the head-to-head subsets were also slightly higher when using the universal mask (0.24% higher on average).
Using each tracer's specific mask did not reduce the variance in the YC compared to the universal mask (Table S2 in supporting information). While it improved the 18F-tracer/11C-PiB correlations for FBP and NAV, it decreased them for FBB and FLT (Table S3 in supporting information).
The mean baseline CL and rates of change are presented in Table 4, along with the group separation and correlation with MMSE in the ADOPIC dataset. The differences in CL at baseline between the standard and universal masks were < 1% for each tracer, and ≈1% for each clinical group. Using the universal mask on the ADOPIC dataset led to a slightly higher effect size at baseline between HC and MCI as well as between HC and AD. The differences in effect size were, however, quite small (< 1%). The annualized rate of change (CL/Yr) was slightly higher in the HC (+0.8%) and MCI (+2.2%) groups when using the universal mask, but lower in the AD group (−5%). For the rate of change, the universal mask led to a higher effect size between HC and AD (+10%) but did not improve the separation between HC and MCI (−13%). Similarly, the correlation between CL and MMSE at baseline did not improve when using the universal mask, although the difference was < 0.5%.
It should also be noted that both sets of CL values were highly correlated, with R² = 0.999 between the CLs obtained using the standard and universal masks.
Using the tracer-specific masks did not improve the effect size at baseline compared to using the universal mask (Table S4 in supporting information). While the separation between HC and AD using the annualized rate of change (CL/Yr) increased, the separation between HC and MCI decreased. The correlation between CL and MMSE at baseline was slightly improved.

DISCUSSION
We have proposed a novel and tracer-unbiased universal CL mask based on the five most commonly used Aβ tracers. This new mask is built as the intersection of the tracer-specific masks, each derived from that tracer's AD-HC difference image, and therefore ensures that only regions where all five tracers measure Aβ are included. By defining the threshold based on the overlap with the standard mask, we also ensured that each mask has a similar extent to that of the standard mask.
Our matching procedure ensured that there were no differences in age or sex between the matched pairs of AD and HC. There were also to its lower dynamic range. The resulting masks also showed good concordance across tracers.
The universal mask was narrower than the standard mask, so that it samples less cerebrospinal fluid (CSF) and white matter. Nevertheless, the standard and universal masks had a good overlap, with a Dice of 0.74.
Using the universal mask on the GAAIN calibration dataset led to a smaller variance in the YC and improved correlation within each head-to-head 11C-PiB/18F-tracer dataset. As stated earlier, less sampling of white matter and CSF by the universal mask might explain the reduced variance in the YC; only sampling regions that are common to all five tracers also likely helped to improve the correlations in the paired datasets.
In the ADOPIC dataset, while using the universal mask increased the group separation between HC/MCI and HC/AD at baseline, those increases were very small (< 1%). The results were also mixed when using the longitudinal rate of change, increasing the HC/AD group separation, but decreasing the HC/MCI one.
As there is no ground truth for Aβ semi-quantification, it can be dif-  21 or even cross-sectional analysis. 7 Future work looking at the optimal reference region for each tracer is therefore warranted.

TABLE 4
Mean CL at baseline and rate of CL change per year for each mask along with the corresponding group separation between the clinical groups, as well as correlation with MMSE for both the standard CL mask and the universal CL mask (higher effect size and higher R² are marked in bold font).

CONCLUSIONS
The universal CL mask led to an increase in inter-tracer agreement and group separation. Those increases were, however, relatively small, indicating that a universal mask is not required and that the existing standard CL mask is suitable for the quantification of all Aβ tracers.

CONSENT STATEMENT
All AIBL participants gave written consent for publication of de-identified data. All ADNI participants signed written informed consent for participation in the ADNI, as approved by the institutional review board at each participating center. All OASIS-3 participants consented to the use of their data by the scientific community, and data sharing terms have been approved by the Washington University Human Research Protection Office.